AITopics | upper layer

Collaborating Authors

upper layer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

e4191d610537305de1d294adb121b513-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-10-2026, 20:46:39 GMT

conv layer, reviewer 2, upper layer, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.98)

Add feedback

788d986905533aba051261497ecffcbb-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 01:12:14 GMT

fast memory, hm-ann, slow memory, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Merced County > Merced (0.04)
North America > United States > Nevada (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > British Columbia > Vancouver (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)

Add feedback

MLATC: Fast Hierarchical Topological Mapping from 3D LiDAR Point Clouds Based on Adaptive Resonance Theory

Ofuchi, Ryosuke, Toda, Yuichiro, Masuyama, Naoki, Matsuno, Takayuki

arXiv.org Artificial IntelligenceDec-1-2025

This paper addresses the problem of building global topological maps from 3D LiDAR point clouds for autonomous mobile robots operating in large-scale, dynamic, and unknown environments. Adaptive Resonance Theory-based Topological Clustering with Different Topologies (ATC-DT) builds global topological maps represented as graphs while mitigating catastrophic forgetting during sequential processing. However, its winner selection mechanism relies on an exhaustive nearest-neighbor search over all existing nodes, leading to scalability limitations as the map grows. To address this challenge, we propose a hierarchical extension called Multi-Layer ATC (MLATC). MLATC organizes nodes into a hierarchy, enabling the nearest-neighbor search to proceed from coarse to fine resolutions, thereby drastically reducing the number of distance evaluations per query. The number of layers is not fixed in advance. MLATC employs an adaptive layer addition mechanism that automatically deepens the hierarchy when lower layers become saturated, keeping the number of user-defined hyperparameters low. Simulation experiments on synthetic large-scale environments show that MLATC accelerates topological map building compared to the original ATC-DT and exhibits a sublinear, approximately logarithmic scaling of search time with respect to the number of nodes. Experiments on campus-scale real-world LiDAR datasets confirm that MLATC maintains a millisecond-level per-frame runtime and enables real-time global topological map building in large-scale environments, significantly outperforming the original ATC-DT in terms of computational efficiency.

artificial intelligence, node, spatial reasoning, (17 more...)

arXiv.org Artificial Intelligence

2511.22238

Country:

Europe (0.67)
Asia > Japan (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)

Add feedback

Attention Saturation and Gradient Suppression at Inflection Layers: Diagnosing and Mitigating Bottlenecks in Transformer Adaptation

Zixian, Wang

arXiv.org Artificial IntelligenceNov-4-2025

Pre-trained Transformers often exhibit over-confidence in source patterns and difficulty in forming new target-domain patterns during fine-tuning. We formalize the mechanism of output saturation leading to gradient suppression through standard cross-entropy and softmax analysis, showing that gradient suppression at inflection layers confines adaptation to high-level recombination of existing features while preventing low-level reconstruction. We introduce a set of layer-wise diagnostic metrics -- attention entropy (saturation proxy), activation gradient norm, parameter gradient norm, and Delta-CKA under a shared PCA basis -- to identify inflection layers characterized by both low attention entropy and steep gradient decay. Building on these findings, we propose a diagnose-first, inject-light fine-tuning strategy: selectively inserting LoRA adapters at inflection layers to restore suppressed backward signals with minimal parameter overhead. Experiments on BERT-base transfer from SST-2 to Rotten Tomatoes under under-trained and over-trained source regimes reveal that over-trained initialization benefits from inflection-layer LoRA injection, while under-trained initialization suffers performance degradation. When base features are strong, unblocking inflection layers facilitates high-level compositional adaptation; when base features are weak, full-pathway unblocking is required for low-level reconstruction, as supported by joint analysis of layer-wise activation gradients and Delta-CKA dynamics.

artificial intelligence, inflection layer, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2511.00797

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation

Wu, Qizhen, Chen, Lei, Liu, Kexin, Lu, Jinhu

arXiv.org Artificial IntelligenceAug-28-2025

-- In swarm robotics, confrontation scenarios, including strategic confrontations, require efficient decision-making that integrates discrete commands and continuous actions. Traditional task and motion planning methods separate decision-making into two layers, but their unidirectional structure fails to capture the interdependence between these layers, limiting adaptability in dynamic environments. Here, we propose a novel bidirectional approach based on hierarchical reinforcement learning, enabling dynamic interaction between the layers. This method effectively maps commands to task allocation and actions to path planning, while leveraging cross-training techniques to enhance learning across the hierarchical framework. Furthermore, we introduce a trajectory prediction model that bridges abstract task representations with actionable planning goals. In our experiments, it achieves over 80% in confrontation win rate and under 0.01 seconds in decision time, outperforming existing approaches. Demonstrations through large-scale tests and real-world robot experiments further emphasize the generalization capabilities and practical applicability of our method. I. INTRODUCTION Recent advances in artificial intelligence lead to significant progress in robotics [1], [2], with particular attention given to robotic swarm confrontations [3], [4].

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2504.15876

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

e4191d610537305de1d294adb121b513-AuthorFeedback.pdf

Neural Information Processing SystemsAug-17-2025, 00:36:06 GMT

artificial intelligence, machine learning, upper layer, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.98)

Add feedback

LoRA Is Slower Than You Think

Ko, Seokmin

arXiv.org Artificial IntelligenceJul-15-2025

Low-Rank Adaptation (LoRA) is one of the most widely used techniques for fine-tuning large language models (LLMs). By introducing a small number of trainable low-rank weight matrices, LoRA substantially reduces the number of parameters that need to be updated, offering significant advantages in memory consumption and computational efficiency compared to full fine-tuning. However, we observed that LoRA does not consistently provide speed improvements across all model architectures and training setups. Motivated by this inconsistency, we conduct a comprehensive analysis of LoRA's performance and investigate the underlying factors limiting its speedup. Based on our findings, we propose several methods for more efficient fine-tuning of LLMs. We empirically evaluate these methods and compare them to LoRA, demonstrating that our approach achieves comparable or superior performance while delivering more consistent training speed improvements. Our work offers valuable insights and practical guidelines for practitioners seeking to optimize LLM fine-tuning under resource constraints.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2507.08833

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Transformers as Multi-task Learners: Decoupling Features in Hidden Markov Models

Hao, Yifan, Ye, Chenlu, Han, Chi, Zhang, Tong

arXiv.org Artificial IntelligenceJun-3-2025

Transformer based models have shown remarkable capabilities in sequence learning across a wide range of tasks, often performing well on specific task by leveraging input-output examples. Despite their empirical success, a comprehensive theoretical understanding of this phenomenon remains limited. In this work, we investigate the layerwise behavior of Transformers to uncover the mechanisms underlying their multi-task generalization ability. Taking explorations on a typical sequence model, i.e, Hidden Markov Models, which are fundamental to many language tasks, we observe that: first, lower layers of Transformers focus on extracting feature representations, primarily influenced by neighboring tokens; second, on the upper layers, features become decoupled, exhibiting a high degree of time disentanglement. Building on these empirical insights, we provide theoretical analysis for the expressiveness power of Transformers. Our explicit constructions align closely with empirical observations, providing theoretical support for the Transformer's effectiveness and efficiency on sequence learning across diverse tasks.

artificial intelligence, machine learning, transformer, (14 more...)

arXiv.org Artificial Intelligence

2506.01919

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Sparsified State-Space Models are Efficient Highway Networks

Song, Woomin, Tack, Jihoon, Mo, Sangwoo, Oh, Seunghyuk, Shin, Jinwoo

arXiv.org Artificial IntelligenceMay-28-2025

State-space models (SSMs) offer a promising architecture for sequence modeling, providing an alternative to Transformers by replacing expensive self-attention with linear recurrences. In this paper, we propose a simple yet effective trick to enhance SSMs within given computational budgets by sparsifying them. Our intuition is that tokens in SSMs are highly redundant due to gradual recurrent updates, and dense recurrence operations block the delivery of past information. In particular, we observe that upper layers of SSMs tend to be more redundant as they encode global information, while lower layers encode local information. Motivated by this, we introduce Simba, a hierarchical sparsification method for SSMs based on token pruning. Simba sparsifies upper layers more than lower layers, encouraging the upper layers to behave like highways. To achieve this, we propose a novel token pruning criterion for SSMs, measuring the global impact of tokens on the final output by accumulating local recurrences. We demonstrate that Simba outperforms the baseline model, Mamba, with the same FLOPS in various natural language tasks. Moreover, we illustrate the effect of highways, showing that Simba not only enhances efficiency but also improves the information flow across long sequences. Code is available at https://github.com/woominsong/Simba.

machine learning, natural language, simba, (18 more...)

arXiv.org Artificial Intelligence

2505.20698

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback